Transportation in Social Media: an automatic classifier for travel-related tweets
In recent years, researchers in the field of intelligent transportation
systems have made several efforts to extract valuable information from social
media streams. However, collecting domain-specific data from any social media
is a challenging task demanding appropriate and robust classification methods.
In this work we focus on exploring geo-located tweets in order to create a
travel-related tweet classifier using a combination of bag-of-words and word
embeddings. The resulting classification makes it possible to identify
interesting spatio-temporal relations in São Paulo and Rio de Janeiro.
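The feature combination mentioned in this abstract can be illustrated with a
minimal sketch. It is not the authors' implementation: the tokenizer, the toy
two-dimensional embedding table, and the function names are all assumptions
made for illustration. Each tweet is represented by its bag-of-words counts
concatenated with the mean of its word embeddings, and the resulting vector
could then be passed to any standard classifier.

```python
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer (an assumption; real tweets need more care).
    return text.lower().split()

def build_vocabulary(texts):
    # Map every distinct token to a fixed column index.
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: i for i, tok in enumerate(tokens)}

def bow_vector(text, vocab):
    # Bag-of-words: one count per vocabulary entry.
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec

def mean_embedding(text, embeddings, dim):
    # Average the embeddings of the tokens that have one; zeros otherwise.
    vectors = [embeddings[tok] for tok in tokenize(text) if tok in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def combined_features(text, vocab, embeddings, dim):
    # Concatenate the sparse lexical view with the dense semantic view.
    return bow_vector(text, vocab) + mean_embedding(text, embeddings, dim)

# Hypothetical toy data: two tweets and a hand-made embedding table.
tweets = ["stuck in traffic on the bridge", "great coffee this morning"]
toy_embeddings = {
    "traffic": [1.0, 0.0],
    "bridge": [0.8, 0.2],
    "coffee": [0.0, 1.0],
}
vocab = build_vocabulary(tweets)
features = [combined_features(t, vocab, toy_embeddings, 2) for t in tweets]
```

Concatenation keeps the exact lexical evidence from the word counts while the
averaged embedding lets related words that never co-occur in the training data
still contribute similar signals.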
Characterizing Geo-located Tweets in Brazilian Megacities
This work presents a framework for collecting, processing and mining
geo-located tweets in order to extract meaningful and actionable knowledge in
the context of smart cities. We collected and characterized more than 9M tweets
from the two biggest cities in Brazil, Rio de Janeiro and São Paulo. We
performed topic modeling using the Latent Dirichlet Allocation model to produce
an unsupervised distribution of semantic topics over the stream of geo-located
tweets as well as a distribution of words over those topics. We manually
labeled and aggregated similar topics obtaining a total of 29 different topics
across both cities. Results showed similarities in the majority of topics for
both cities, reflecting similar interests and concerns among the population of
Rio de Janeiro and São Paulo. Nevertheless, some specific topics are more
predominant in one of the cities.
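As a rough illustration of the two distributions this abstract describes
(topics over documents and words over topics), the sketch below fits Latent
Dirichlet Allocation with a collapsed Gibbs sampler. The abstract does not say
which inference method or hyperparameters the authors used, so the sampler,
the values of `alpha` and `beta`, the function name `lda_gibbs`, and the toy
documents are all assumptions.

```python
import random

def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates: topics per document (theta), words per topic (phi).
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][v] + beta) / (topic_total[t] + vocab_size * beta)
         for v in range(vocab_size)]
        for t in range(num_topics)
    ]
    return vocab, theta, phi

# Hypothetical toy corpus: each "tweet" is already tokenized.
docs = [
    ["traffic", "bus", "metro", "traffic"],
    ["beach", "sun", "beach", "weekend"],
    ["metro", "bus", "delay"],
    ["sun", "weekend", "beach"],
]
vocab, theta, phi = lda_gibbs(docs, num_topics=2)
```

Each row of `theta` is a document's distribution over topics and each row of
`phi` is a topic's distribution over the vocabulary. Labeling and merging
topics, as the abstract describes, would be a manual step on top of the
highest-probability words in each row of `phi`.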
A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese
Textual health records of cancer patients are usually protracted and highly
unstructured, making it very time-consuming for health professionals to get a
complete overview of the patient's therapeutic course. As such limitations can
lead to suboptimal and/or inefficient treatment procedures, healthcare
providers would greatly benefit from a system that effectively summarizes the
information of those records. With the advent of deep neural models, this
objective has been partially attained for English clinical texts; however, the
research community still lacks an effective solution for languages with limited
resources. In this paper, we present the approach we developed to extract
procedures, drugs, and diseases from oncology health records written in
European Portuguese. This project was conducted in collaboration with the
Portuguese Institute for Oncology which, besides holding over years of
duly protected medical records, also provided oncologist expertise throughout
the development of the project. Since there is no annotated corpus for
biomedical entity extraction in Portuguese, we also present the strategy we
followed in annotating the corpus for the development of the models. The final
models, which combined a neural architecture with entity linking, achieved
scores of , , and per cent in the mention extraction
of procedures, drugs, and diseases, respectively.
Report on the Second International Workshop on Narrative Extraction from Texts (Text2Story 2019)
The Second International Workshop on Narrative Extraction from Texts (Text2Story’19 [http://text2story19.inesctec.pt/]) was held on the 14th of April 2019, in conjunction with the 41st European Conference on Information Retrieval (ECIR 2019) in Cologne, Germany. The workshop provided a platform for researchers in IR, NLP, and design and visualization to come together and share the recent advances in extraction and formal representation of narratives. The workshop consisted of two invited talks, ten research paper presentations, and a poster and demo session. The proceedings of the workshop are available online at http://ceur-ws.org/Vol-2342/
ECIR 2018: Text2Story Workshop-Narrative Extraction from Texts
The 1st International Workshop on Narrative Extraction from Texts (Text2Story 2018) was held in conjunction with the 40th European Conference on Information Retrieval (ECIR 2018) in Grenoble on the 26th of March 2018. The workshop aimed to foster collaboration among researchers on a wide range of multidisciplinary issues related to the text-to-narrative structure. The program consisted of two keynote talks, six research presentations, a poster session, and a slot for demo presentations. This report briefly summarizes the workshop.
Algorithmic Science News: support platform for science journalism
The Algorithmic Science News (ASN) platform is a new tool
created by a multidisciplinary team that emerged from the
need to reinscribe the role of science news in today's media
space. The platform aims not only to increase the number of
science news stories available to editors, but also to reduce the
effort associated with more time-consuming tasks, such as
collecting data and analyzing scientific articles, thereby
facilitating the entire news production process. The ASN platform
includes a set of features designed to support a journalist's
usual work in newsroom contexts, allowing the
use of documents from open-access scientific repositories. The
algorithms that work on these repositories were developed
to facilitate access to and exploration
of these collections and to allow the media to use them as an
information source. In this project, we developed tools for reading and interpretation,
writing, and semantic suggestion. With regard to reading
and interpretation, ASN helps find specialists related
to the scientific article, summarizes its most important
parts (namely the introduction, objectives, methodologies,
and results), presents definitions of technical terms, and suggests
projects related to the topic. With regard to writing,
the platform lets users write notes linked to parts of the
article and access sentence suggestions. Using a
beta version of this platform in newsroom contexts will show the
extent to which automating tasks associated
with journalistic production can help media outlets in
transition to digital and contribute to greater efficiency
in the tasks associated with science journalism, enabling
greater scale and quality in the production of
science news in the current media landscape.